Hidden-Markov-Model Based Speech Enhancement
Abstract
The goal of this contribution is to use a parametric speech synthesis system to reduce background noise and other interference in recorded speech signals. In a first step, the Hidden Markov Models of the synthesis system are trained. Two suitable training corpora, each consisting of text and corresponding speech files, have been set up and cleared of various faults, including inaudible utterances and incorrect assignments between audio and text data. The corpora are tested and compared against each other with regard to, for example, flaws in the synthesized speech and its naturalness and intelligibility. Different voices have thus been synthesized, whose quality depends less on the number of training samples used and much more on the cleanliness and signal-to-noise ratio of those samples. Generalized voice models have been used for synthesis, and the results differ greatly between the two speech corpora. Tests of adaptation to different speakers show that a resemblance to the original speaker is audible throughout all recordings, yet the synthesized voices sound robotic and unnatural in smaller parts. The spoken text, however, is usually intelligible, which shows that the models are working well. In a novel approach, speech is synthesized using side information from the original audio signal, in particular the pitch frequency. Results show an increase in speech quality and intelligibility compared with speech synthesized solely from text, up to the point of being nearly indistinguishable from the original.
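The abstract names the pitch frequency as the side information taken from the original recording but does not say how it is extracted. As a rough illustration only, the sketch below estimates the pitch of a single voiced frame with a plain autocorrelation peak search. The function name `estimate_pitch_hz`, the 50 to 400 Hz search range, and the use of autocorrelation itself are assumptions made for this example, not details taken from the paper.

```python
import math


def estimate_pitch_hz(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Rough pitch estimate (Hz) for one frame via autocorrelation.

    Hypothetical helper for illustration; the paper's actual pitch
    extractor is not described in the abstract.
    """
    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    # Convert the assumed pitch range into a range of lags (in samples).
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(x) - 1)

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation; fewer terms at larger lags
        # give a mild preference to the shortest matching period.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    # No positive correlation peak (e.g. silence): report no pitch.
    return sample_rate / best_lag if best_lag else None


# Usage: a synthetic 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1024)]
pitch = estimate_pitch_hz(frame, sample_rate)
```

A per-frame pitch track of this kind is one plausible form for the side information; real recordings would additionally need a voiced/unvoiced decision and smoothing across frames, which this sketch leaves out.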
Similar resources
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
Using hidden Markov models for speech enhancement
This work presents an approach to speech enhancement that operates using a speech production model to reconstruct a clean speech signal from a set of speech parameters that are estimated from the noisy speech. The motivation is to remove the distortion and residual and musical noises that are associated with conventional filtering-based methods of speech enhancement. The STRAIGHT vocoder forms ...
Performance Analysis of Speech Enhancement Algorithm for Robust Speech Recognition System
Widely Speech Signal Processing has not been used much in the field of electronics and computers due to the complexity and variety of speech signals and sounds with the advent of new technology. However, with modern processes, algorithms, and methods which can proc Demand for speech recognition technology is expected to their mobile phones as all purpose lifestyle devices. In this paper, an imp...
Comparison of formant enhancement methods for HMM-based speech synthesis
Hidden Markov model (HMM) based speech synthesis has a tendency to over-smooth the spectral envelope of speech, which makes the speech sound muffled. One means to compensate for the over-smoothing is to enhance the formants of the spectral model. This paper compares the performance of different formant enhancement methods, and studies the enhancement of the formants prior to HMM training in ord...
An approach to iterative speech feature enhancement and recognition
In this paper we propose a novel iterative speech feature enhancement and recognition architecture for noisy speech recognition. It consists of model-based feature enhancement employing Switching Linear Dynamical Models (SLDM), a hidden Markov Model (HMM) decoder and a state mapper, which maps HMM to SLDM states. To consistently adhere to a Bayesian paradigm, posteriors are exchanged between th...
Minimum cost based phoneme class detection for improved iterative speech enhancement
It is known that degrading acoustic noise influences speech quality across phoneme classes in a non-uniform manner. This results in variable quality performance for many speech enhancement algorithms in noisy environments. To address this, a hidden-Markov-model phoneme classification procedure is proposed which directs single channel speech enhancement across individual phoneme classes. The proc...
Journal title:
- CoRR
Volume abs/1707.01090
Publication date 2017